Translation of "It" in a Deep Syntax Framework
Abstract
We present a novel approach to translating the English personal pronoun "it" into Czech. We conduct a linguistic analysis of how the distinct categories of "it" are usually mapped to their Czech counterparts. Armed with these observations, we design a discriminative translation model of "it", which is then integrated into the TectoMT deep syntax MT framework. Features in the model take advantage of the rich syntactic annotation that TectoMT is based on, external tools for anaphoricity resolution, lexical co-occurrence frequencies measured on a large parallel corpus, and gold coreference annotation. Even though the new model for "it" exhibits no improvement in terms of BLEU, manual evaluation shows that it outperforms the original solution in 8.5% of sentences containing "it".
Similar resources
Phrase-Boundary Translation Model Using Shallow Syntactic Labels
The phrase-boundary model for statistical machine translation labels the rules with classes of boundary words on the target-side phrases of the training corpus. In this paper, we extend the phrase-boundary model using shallow syntactic labels, including POS tags and chunk labels. With the priority of chunk labels, the proposed model names non-terminals with shallow syntactic labels on the boundaries of ...
Two Case Studies on Translating Pronouns in a Deep Syntax Framework
We focus on improving the translation of the English pronoun "it" and English reflexive pronouns in an English-Czech syntax-based machine translation framework. Our evaluation from both intrinsic and extrinsic perspectives shows that adding specialized syntactic and coreference-related features leads to an improvement in translation quality.
Treex - an open-source framework for natural language processing
The present paper describes Treex (formerly TectoMT), a multi-purpose open-source framework for developing Natural Language Processing applications. It facilitates the development by exploiting a wide range of software modules already integrated in Treex, such as tools for sentence segmentation, tokenization, morphological analysis, part-of-speech tagging, shallow and deep syntax parsing, named...
Examining the Effect of Ideology and Idiosyncrasy on Lexical Choices in Translation Studies within the CDA Framework
Using a critical discourse analytic model of translation criticism, the present study attempts to explore the effect of ideology and idiosyncrasy on the lexical choices in translation studies. The study employed a descriptive approach to answer two research questions: Is there any relationship between ideology and idiosyncratic features of translators' lexical choices? And if yes, can it be ana...
Critical Assessment of Poetic Imagery Translation in Nizami’s “Leili & Majnun” by James Atkinson
Poetry translation involves cognition, discourse, and action by and between humans and textual actors in physical and social settings. The aim of this study was to find out to what extent the non-native translator of Nizami Ganjavi’s “Leili and Majnun” could preserve the poetic imagery in its English translation. To this end, an innovative taxonomic model, which could investiga...